Abstract

Node localization algorithms that can be easily integrated into deployed wireless sensor networks (WSNs)
and that run seamlessly with proprietary lower-layer communication protocols on off-the-shelf
modules can help operators of large farms and orchards avoid the difficulty, cost and time involved in
manual or satellite-based node localization. Even though state-of-the-art node localization
algorithms can achieve low error rates using distributed techniques such as belief propagation (BP),
they are not well suited to WSNs deployed for precision agriculture, which have a large number of
nodes and few landmarks, and they lack real-time update capability. The algorithm proposed here is
designed for applications such as pest control and irrigation in large farms and orchards, where greater
power efficiency and scalability are required but location accuracy requirements are less demanding. Our
algorithm uses received signal strength indicator (RSSI) values to estimate the distribution of the distance
between nodes, then updates the location probability mass function (pmf) of nodes in a distributed manner.

At every time step, the most recently communicated path loss samples and the location prior pmfs received
from neighbouring nodes are sufficient for nodes with unknown location to update their location pmf. This
renders the algorithm recursive and hence results in lower computational complexity at each time step. We
propose a particular realization of the method in which only one node multicasts at each time step and
neighbouring nodes update their location pmf conditioned on all samples communicated over previous
time steps. This is highly compatible with realistic WSN deployments, e.g., ZigBee, which is based
on ad hoc on-demand distance vector (AODV) routing, where nodes flood route request (RREQ) and route
reply (RREP) packets. Further, beacon signals transmitted during the network formation and routing table
formulation stage can provide the RSSI information required by the localization algorithm.

Index Terms 

Wireless sensor networks, distributed localization, range-based localization algorithms, path loss
measurements, information aggregation, precision agriculture

I. INTRODUCTION

With the advent of short-range wireless technologies and standards in the late 1990s, a variety of wireless
localization techniques for indoor and outdoor applications have been developed. A wide range of indoor
localization techniques have emerged based on cameras, infrared, wireless local area networks (WLAN),
ultra wide band (UWB), Bluetooth, and radio-frequency identification (RFID) [1], whereas global
positioning system (GPS) technology revolutionized outdoor localization. Even though GPS-based localization
techniques are attractive in terms of accuracy, their impaired coverage in metropolitan environments
and the lack of cost-effective scalable solutions sparked the emergence of IEEE802.15.4/ZigBee RSSI-based
localization algorithms. These techniques have an advantage over Bluetooth, UWB and Wi-Fi owing to their
energy efficiency and their capability to support long-range communication and mesh networking [2].

Localization techniques have been developed for different types of applications and are compared
in terms of accuracy, coverage, cost, responsiveness and adaptiveness to environmental changes [3], [4].
While some techniques, such as laser and camera-based technologies, are highly accurate and scalable in
terms of coverage, they are usually too expensive for applications in large environments. For large-scale
outdoor applications such as agricultural environments in particular, a cost-effective, scalable and fast
localization technique that is robust against seasonal environmental variations, e.g., growing season
changes, is needed. On the other hand, accuracy requirements are usually looser because of the relatively
large inter-node distances, which correspond to the correlation distance of the measured features.

One of the rapidly growing WSN application areas for outdoor environments is precision agriculture, which
enhances crop management and yield through sophisticated management of soil, water resources and
applied inputs [5]. WSNs are deployed to improve spatial data collection, precision irrigation and
variable-rate technology, and to supply data to farmers [6]. This requires sampling of critical features
such as soil pH, moisture and electrical conductivity, in addition to the deployment of actuators to trigger
a wide variety of processes ranging from drip irrigation to pest management, e.g., mating disruption. In
order to provide meaningful feature maps that improve resource management and decision making, it is
critical to know the location of the sensors that have generated the data. Loose accuracy requirements,
together with the cost of equipping all sensors with GPS, raise the need for localization algorithms that
are low cost and compatible with commercial off-the-shelf (COTS) transceiver modules.

Anchor-based localization algorithms make use of landmarks, or anchor nodes, to help localize
unknown nodes [7] and are divided into range-based and range-free techniques. Range-free algorithms
only take advantage of connectivity information [8], i.e., whether nodes are within
communication range of each other, whereas range-based algorithms exploit time of arrival (TOA),
angle of arrival (AOA) or RSSI to estimate the distances between nodes, the so-called inter-node distances.
RSSI-based techniques are attractive in that no additional hardware is required to estimate
distance [9]. Further, even though AOA and TOA-based techniques are more precise, they
are more complex: the former requires multiple antennas to detect signals arriving from
different directions, whereas the latter demands a large bandwidth for better multipath resolution.

This work presents a probabilistic, distributed, range-based localization technique for static WSNs that is
based on RSSI samples and a Bayesian model for information aggregation, and that is particularly suited to
precision agriculture applications. Most probabilistic distributed localization techniques rely on
marginalization over a Markov random field (MRF), where the joint distribution of the node locations
$\{x_1, x_2, \dots, x_n\}$ given noisy distance measurements between pairs of nodes $\{d_{ij}\}$ is
expressed as a product of node and pairwise potentials,
$P(x_1, \dots, x_n \mid \{d_{ij}\}) \propto \prod_{i,j} P(d_{ij} \mid x_i, x_j) \prod_{i} P(x_i)$ [10].
Message passing algorithms such as belief propagation (BP), nonparametric belief propagation (NBP) and
their variants have been proposed to compute the marginals, and hence the location of each node
[10], [11], [12], [13]. BP-based techniques are vulnerable to loopy graphs, which cause them either not to
converge at all or to converge only under specific conditions on the number of loops [14]. These techniques
have therefore mostly been used in scenarios where a few slowly moving or static nodes, together with a
relatively high number of anchors, all equipped with short-range transmitters, render the statistical graph
a spanning tree or a graph with few loops. Another shortcoming of these techniques is that global
information from distance measurements must be available so that the statistical graph can be formed before
the algorithm starts to run. For these two reasons, even though relatively high accuracy is achieved
with these techniques, a considerable amount of communication overhead, at least $O(n)$ depending on the
technique, is required to form the spanning tree or statistical graph using multi-hop communications.
The second issue is addressed in [15], where nodes only exchange information with their single-hop
neighbours; however, the communication and computation overhead required for building spanning trees,
with landmarks designated as roots and other nodes keeping track of paths, remains, since the procedure
demands independence of the paths that arrive at the updating node. In contrast, in precision agriculture
applications, the relatively high number of connected unknown nodes resulting from high transmit power
levels, and the underlying IEEE802.15.4 WSNs, which work in conjunction with the route discovery phase of
AODV, call for a real-time algorithm that relies on local single-hop information and is not susceptible
to loops in the network.

Our work is similar to [15] in the sense that nodes only communicate with their single-hop
neighbours and update their location in real time rather than having to build the statistical
graph using multi-hop communications as in MRF-based approaches. However, our algorithm needs no
initialization in terms of spanning tree construction or having to start from a specific node or landmark
in the field. In other words, the proposed technique is well positioned to address self-localization in
do-it-yourself (DIY) networks that run ZigBee or other proprietary mesh networking protocols on top of
the IEEE802.15.4 specifications. The reason is that the algorithm works in conjunction
with the route discovery phase of AODV-based routing protocols such as ZigBee, where a route request
(RREQ) packet originating from an arbitrary source node is flooded through the entire network. We derive a
closed-form recursive relationship for the Bayesian update of a node's location at a time step during which
one or multiple path loss samples are generated, and therefore call it a Bayesian model for information
aggregation. We prove that the location constraint resulting from a generated path loss sample is in fact
the convolution of the path loss likelihood and the most recent location estimate of the generating node.
Realistic independence assumptions, supported by our measurements, are made to prove that location
constraints resulting from dependent paths (loop-forming paths) multiply. This makes the algorithm faster by
eliminating spanning tree construction and intermediate node tracking, and by making use of the constraints
resulting from the paths traversed by flooded RREQ packets, whereas the algorithm's robustness against
loops is verified by extensive simulations.

Since our goal is to devise an algorithm that can work in conjunction with COTS transceiver
modules, we characterize path loss in the 2.45 GHz industrial, scientific and medical (ISM) band. Based
on our measurements in apple orchards, a log-normal path loss model is proposed for high density apple
orchards and for different transmitter (Tx) and receiver (Rx) antenna heights. The Rx was placed
below tree height, whereas the Tx was fixed either below or above the tree height. In the rest of this
paper, these two antenna height modes are called below and above canopy level, respectively. The path loss
data was collected during three measurement campaigns over two consecutive summer seasons.

The remainder of this paper is organized as follows. In Section II, we formulate the localization
problem, define the notation, include a brief summary of our measurement campaigns and explain the
path loss model along with the path loss likelihood function conditioned on node locations. In Section III,
we devise a recursive solution to the problem stated in Section II and propose a specific implementation
of this solution based on nodes multicasting in a TDMA manner. Finally, we proceed with simulations and
evaluation of our algorithm in Section IV and wrap up the paper with conclusions in Section V.

II. The Localization Problem and Path Loss Likelihood Function

As stated in the Introduction, pinpoint localization accuracy is not required for precision agriculture
applications such as pest control, since knowing the approximate location of the originating sensors
suffices to trigger the relevant actuators. Accordingly, we define the localization problem in a discrete
manner: the agricultural field is divided into smaller square cells, and the location of each unknown node
is determined as the centroid of one of these cells. The precision of the algorithm is adjustable via the
number of grid cells inside the field; however, precision levels off once the grid resolution exceeds a
threshold. The formulation of the localization problem based on aggregated path loss samples from neighbour
nodes is discussed in Section II-A, and the path loss model for orchard environments is explained briefly
in Section II-B.

A. Problem Formulation 

Let $\mathcal{S} = \{S_1, \dots, S_N\}$ be a set of sensors randomly scattered in a square field which is
divided into $m \times m$ square cells with equal areas, and let $\Omega = \{1, 2, \dots, m^2\}$ be the
sample space of all possible cell coordinates. Our objective is to make use of inter-node communications
to find the grid cell in which each node is located. In the following, we introduce the notation and
formalize the localization problem.
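
As an illustration of this discretization, the following Python sketch maps cell indices to centroids and tabulates the distances between cells. The field side length, the row-major (and zero-based) numbering of cells and the function names are assumptions made for this example only; the text specifies just an m x m grid over a square field.

```python
import numpy as np

def cell_centroids(m, side):
    """Return an (m*m, 2) array of cell centroids for a square field.

    Cells are numbered 0 .. m*m-1 in row-major order (an assumption;
    the paper numbers them 1 .. m^2 without fixing an order).
    """
    cell = side / m                      # edge length of one cell
    idx = np.arange(m * m)
    rows, cols = np.divmod(idx, m)
    x = (cols + 0.5) * cell
    y = (rows + 0.5) * cell
    return np.column_stack([x, y])

def pairwise_distances(centroids):
    """Euclidean distance between every pair of centroids."""
    diff = centroids[:, None, :] - centroids[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Hypothetical 400 m x 400 m field split into 4 x 4 cells.
centroids = cell_centroids(m=4, side=400.0)
dist = pairwise_distances(centroids)
```

The distance table only depends on the grid, so under these assumptions it could be computed once and reused for every pair of candidate cells.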

Without loss of generality, let the first $n_a$ nodes be landmarks, $\mathcal{S}_l = \{S_1, \dots, S_{n_a}\}$,
and let the unknown nodes be represented by $\mathcal{S}_u = \{S_{n_a+1}, \dots, S_N\}$, while $y_{ij}^{l}$
is a path loss sample, or the average of multiple path loss samples, that $S_j$ collects from $S_i$ at the
$l$-th time step. Note that, in general, multiple samples could be collected if each calculation time step
is made up of multiple communication time slots. Let $\Theta_k$ denote the vector of path loss samples that
have been communicated between pairs of connected nodes during the first $k$ time steps, and let
$Y_j^{(k)}$ represent the vector of all path loss samples that $S_j$ has collected from its neighbour nodes,
with index set $N_j$, at the $k$-th time step,

$$\Theta_k = \left(y_{ij}^{l}\right)_{l = 0:k,\; 1 \le i < j \le N}, \qquad Y_j^{(k)} = \left(y_{ij}^{k}\right)_{i \in N_j}. \tag{1}$$

Note that $y_{mj}^{k}$ is not available if $S_j$ has not collected any sample from $S_m$ at the $k$-th time
step. Let $X_j^{(k)}$ be a random variable defined over $\Omega$ representing the location estimate of $S_j$
at the $k$-th time step. Since we seek to estimate the location of $S_j$ at the $M$-th time step based on
the previously aggregated data $\Theta_M$,

$$\hat{x}_j = \underset{x_j}{\operatorname{argmax}} \left[ P\left(X_j^{(M)} = x_j \mid \Theta_M\right) \right], \tag{2}$$

where $P(\cdot)$ is the probability function and $\underset{x}{\operatorname{argmax}}[f(x)]$ is the set of
points $x$ for which $f(x)$ attains its largest value. In the remainder of this section, the path loss model
for the agricultural environment, which is the key to generating the samples $y_{ij}^{k}$, $\Theta_k$ and
$Y_j^{(k)}$, is explained. We then derive the path loss likelihood function that underpins the recursive
algorithm described in Section III. Specifically, we derive the likelihood of $y_{ij}^{k}$ given that $S_i$
and $S_j$ are estimated to be located at $x_i$ and $x_j$ respectively, i.e.,
$P(y_{ij}^{k} \mid X_j^{(k)} = x_j, X_i^{(k)} = x_i)$.

B. A Representative Path Loss Model for Orchard Environments 

In this section, we explain the path loss model resulting from our measurement campaigns in apple orchards
located at Keremeos, BC, Canada. This underlies the work in Section II-C, which derives the
path loss likelihood function expressing the path loss distribution conditioned on the Tx and Rx locations.
There is an extensive literature on path loss models for forests and agricultural environments. It is claimed
that the log-distance path loss model provides a good fit to the measured path loss in vegetated environments
[16], [17], [18],

$$PL[\mathrm{dB}] = PL_0 + 10 n \log_{10}\frac{d}{d_0} + X_\sigma, \tag{3}$$

where $X_\sigma$ is a zero-mean normal random variable with standard deviation $\sigma$,
$X_\sigma \sim \mathcal{N}(0, \sigma)$, whereas $PL_0$ represents the path loss at reference distance $d_0$
and $n$ denotes the path loss exponent for the specific case of study.
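
To make the model in (3) concrete, the sketch below evaluates its deterministic part and draws random samples from it in Python. The parameter values are those reported in Table I for the below canopy mode; the reference distance d0 = 1 m, the function names and the random seed are assumptions introduced for this illustration and are not stated in the text.

```python
import numpy as np

def mean_path_loss(d, pl0, n, d0=1.0):
    """Deterministic part of (3): PL0 + 10 n log10(d / d0), in dB."""
    return pl0 + 10.0 * n * np.log10(np.asarray(d, dtype=float) / d0)

def sample_path_loss(d, pl0, n, sigma, d0=1.0, rng=None):
    """Draw path loss samples by adding zero-mean Gaussian shadowing."""
    rng = np.random.default_rng() if rng is None else rng
    mean = mean_path_loss(d, pl0, n, d0)
    return mean + rng.normal(0.0, sigma, size=np.shape(mean))

# Below canopy values from Table I; d0 = 1 m is an assumption.
d = np.array([10.0, 50.0, 100.0])
mean_db = mean_path_loss(d, pl0=75.0, n=3.61)
samples = sample_path_loss(d, pl0=75.0, n=3.61, sigma=5.27,
                           rng=np.random.default_rng(0))
```

Under these assumed values the mean path loss grows by 10 n = 36.1 dB per decade of distance, which is the slope the exponent n encodes.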

We carried out the measurements in Dawson orchards at Keremeos, Okanagan, British Columbia. 
Measurements were conducted in a 6 hectare (ha) orchard consisting of apple tree rows divided into 
standard and high density in terms of vegetation and canopy density with trees being approximately 
3 m high. We use the path loss data collected from four directions of along, cross, 30°, 45° and 60° 
with respect to tree rows, using different transmitter (Tx) and receiver (Rx) antenna heights. Further, we 
conducted measurements with Tx at 2.5 m (below canopy level) and 4 m (above canopy level) heights 
and Rx at 2.5 m. This setup is compatible with realistic WSN deployment scenarios where gateways, 
responsible for aggregating data of their neighbouring sensors, are mounted above canopy whereas sensors 
and actuators are placed inside the canopy. As far as localization is concerned, gateways, which have better
line of sight (LOS), are equipped with GPS to play the landmark role. The measurements were conducted
throughout three different measurement campaigns, seven days combined and spread across two summer 
seasons. 

Measurements were made over an approximate range of 0-100 m, at points approximately 10 m
apart, in 9 different parts of the orchard along the four directions illustrated in Figure 1.
On the transmitter side, our equipment comprised an Agilent E8267D vector signal generator (VSG) feeding a
2.45 GHz omnidirectional dipole antenna with 5 multi-tones (5 MHz apart from each other) through a
ZVA-213 power amplifier, which provides +23 dBm at the antenna input. On the receiver side,
a Toshiba laptop running MATLAB and Agilent Connection Expert, proprietary software
for connecting a computer to an Agilent spectrum analyzer, was connected to an N9342C handheld spectrum
analyzer (HSA) via a LAN cable. Extra losses and gains resulting from cables, connectors and antennas
at both the Tx and Rx sides were taken into account in the calibration.


[Figure 1, panels (a) and (b); labelled: cross-track link, 50 m]

Fig. 1: The 9 measurement scenarios inside the orchard. The transmitter antenna was moved 50 m across the
rows to form a new scenario, whereas the Rx was moved along four different directions of along, cross, 30°,
45° and 60° for each scenario, and path loss samples were collected over the 0-100 m range at points
≈ 10 m apart. The Rx antenna was placed at 2.5 m elevation (0.5 m below tree height), while the Tx antenna
was at 2.5 m and 4 m elevation (1 m above canopy level).


A summary of the path loss statistics, along with the statistical measure $R^2$, which indicates how well
the data fit the log-distance model, and the 95% confidence intervals (CI) for $PL_0$ and $n$, is given in
Table I, whereas path loss samples for the two modes are illustrated in Figure 3. Note that gateway-to-node
and node-to-node communications correspond to the above and below canopy level Tx modes, respectively.

C. Path Loss Likelihood Function 

In this part, we derive the likelihood function $P(y_{ij}^{k} \mid X_j^{(k)} = x_j, X_i^{(k)} = x_i)$,
which is a key component of the algorithm we propose in the next section since it relates path loss values
to inter-node distances.

TABLE I: Path loss model characteristics for above and below canopy level modes

Mode                               n      PL_0 [dB]   σ [dB]   R^2    95% CI for n   95% CI for PL_0 [dB]
2.45 GHz, Tx below canopy level    3.61   75          5.27     0.74   3.36-3.86      71-79
2.45 GHz, Tx above canopy level    2.91   72          4.14     0.78   2.60-3.22      67-77


Fig. 2: Location pmf of unknown nodes is updated recursively. The agricultural field is divided into
m x m cells with equal area, and the probability of an unknown node being located inside each cell is
calculated based on recently aggregated path loss samples and the prior location pmf of connected nodes.


Fig. 3: Path loss comparison for below and above canopy Tx scenarios. Path loss samples for below and above
canopy Tx level at 2.45 GHz collected from three measurement campaigns; the difference between the above
and below canopy modes, which is due to more line of sight (LOS) between Tx and Rx in the above canopy
case, can be seen.

Assuming the log-distance path loss model discussed in Section II-B and considering a random point on the
field, the probability of a path loss sample $y_{ij}^{k}$ falling in the range
$[pl_{ij} - \frac{\Delta}{2}, pl_{ij} + \frac{\Delta}{2}]$, with $\Delta \ll pl_{ij}$ and with $S_j$
located at distance $d_{ij}$ from $S_i$, is calculated by

$$P\left[pl_{ij} - \frac{\Delta}{2} < y_{ij}^{k} < pl_{ij} + \frac{\Delta}{2} \,\middle|\, D = d_{ij}\right] = \frac{C\Delta}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{\left(pl_{ij} - PL(d_{ij})\right)^2}{2\sigma^2}\right), \tag{4}$$

where $PL(d) = PL_0 + 10 n \log_{10}(d/d_0)$ and $C$ is the normalization constant. Based on (4), and the
fact that each pair $(x_i, x_j)$ translates into the corresponding distance $d_{ij}$, sensor $S_j$
calculates $P(y_{ij}^{k} = pl_{ij} \mid X_j^{(k)} = x_j, X_i^{(k)} = x_i)$, $\forall x_i, x_j \in \Omega$.
In practice, to approximate this conditional probability, we sample the amplitude of the normal
distribution with mean $PL(d_{ij})$ and standard deviation $\sigma$ over the range
$[PL(d_{ij}) - 3\sigma, PL(d_{ij}) + 3\sigma]$ at 1 dB steps and normalize the values so that
they sum to one. Note that the path loss model proposed in Section II-B is used both to derive the path
loss likelihood function and to generate random path loss samples in our simulations in Section IV.
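
The discretization described above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: centring the 1 dB grid on the mean path loss, snapping an observed sample to the nearest grid point, the reference distance d0 = 1 m and the function name are all assumptions. The parameter values are the below canopy values of Table I.

```python
import numpy as np

def path_loss_likelihood(pl_db, d, pl0, n, sigma, d0=1.0, step=1.0):
    """Approximate P(y = pl_db | distance d) on a grid of `step` dB.

    The normal density with mean PL(d) and standard deviation sigma is
    evaluated at `step`-spaced points in [PL(d) - 3 sigma, PL(d) + 3 sigma]
    and normalised to sum to one. Samples outside that window get
    probability zero.
    """
    mean = pl0 + 10.0 * n * np.log10(d / d0)
    half = int(np.floor(3.0 * sigma / step))
    offsets = np.arange(-half, half + 1) * step
    weights = np.exp(-0.5 * (offsets / sigma) ** 2)
    weights /= weights.sum()
    # Snap the observed sample to the nearest grid point (assumption).
    k = int(np.round((pl_db - mean) / step))
    if abs(k) > half:
        return 0.0
    return float(weights[k + half])

# Example with the below canopy values of Table I and d = 10 m.
params = dict(pl0=75.0, n=3.61, sigma=5.27)
mean_10m = 75.0 + 10.0 * 3.61 * np.log10(10.0)
grid = mean_10m + np.arange(-15, 16)          # 31 points, 1 dB apart
pmf = np.array([path_loss_likelihood(v, 10.0, **params) for v in grid])
```

One consequence of truncating at three standard deviations is that a sample outside the window receives zero likelihood; an implementation may prefer a small floor value so that a single outlier cannot rule out a cell entirely, though the text does not say how this case is handled.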


III. Localization Algorithm For Precision Agriculture Applications 

In this section, we derive an algorithm for the problem stated in (2) based on a Bayesian model
for information aggregation. Our objective is therefore to derive a recursive expression for
$P(X_j^{(k)} = x_j \mid \Theta_k)$ that explains how the location pmf is updated as information is
aggregated in the network or, in other words, as the most recent evidence, an RSSI sample, is collected.
In Section III-A, we first solve the problem for the general case where, at each calculation time step, an
arbitrary amount of information, or number of packets, is exchanged between one or multiple pairs of nodes.
In Section III-B, we proceed with the special case that is more compatible with the route discovery phase
of AODV-based routing protocols such as ZigBee. This is the algorithm we have simulated in Section IV.


A. General Case 

Following the notation introduced in Section II, and assuming that at each time step $S_j$ updates its
location pmf based only on the samples it has received from its single-hop neighbours, i.e., not on samples
communicated between other pairs of nodes,

$$P\left(X_j^{(k)} = x_j \mid \Theta_k\right) = P\left(X_j^{(k)} = x_j \mid \Theta_{k-1}, Y_j^{(k)}\right). \tag{5}$$

Applying Bayes' rule to the right-hand side of (5),

$$P\left(X_j^{(k)} = x_j \mid \Theta_{k-1}, Y_j^{(k)}\right) \propto P\left(Y_j^{(k)} \mid X_j^{(k)} = x_j, \Theta_{k-1}\right) P\left(X_j^{(k)} = x_j \mid \Theta_{k-1}\right). \tag{6}$$

Recall that, in general, each calculation time step could be made up of several communication time
slots; we have therefore used $Y_j^{(k)}$, the path loss samples that $S_j$ collects from one neighbour or a
set of neighbours at the $k$-th time step. Rephrasing (6) yields the recursive form,

$$P\left(X_j^{(k)} = x_j \mid \Theta_k\right) \propto P\left(Y_j^{(k)} \mid X_j^{(k)} = x_j, \Theta_{k-1}\right) P\left(X_j^{(k)} = x_j \mid \Theta_{k-1}\right). \tag{7}$$

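We then simplify $P(Y_j^{(k)} \mid X_j^{(k)} = x_j, \Theta_{k-1})$ on the right-hand side of (7). Letting
$\perp$ denote statistical independence and assuming that

$$y_{ij}^{k} \perp \left(y_{mj}^{k} \mid X_j^{(k)}, \Theta_{k-1}\right) \quad \forall i, m \in N_j, \tag{8}$$

the first term on the right-hand side of (7) can be written as

$$P\left(Y_j^{(k)} \mid X_j^{(k)} = x_j, \Theta_{k-1}\right) = \prod_{i \in N_j} P\left(y_{ij}^{k} \mid X_j^{(k)} = x_j, \Theta_{k-1}\right). \tag{9}$$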

Our measurements, processed following the procedure in [19], verify the assumption in (8). Further, our
measurements show that the shadowing correlation between links in the vegetated environment studied here
is very low (below 0.1). This is reasonable given the long links we are dealing with, which are
≈ 50 m for pest management applications. Owing to lack of space and its limited relevance to the main
topic, we spare the reader the details of the shadowing correlation calculation.

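Based on the law of total probability, we simplify the right-hand side of (9),

$$\begin{aligned}
P\left(y_{ij}^{k} \mid X_j^{(k)} = x_j, \Theta_{k-1}\right)
&= \sum_{x_i} P\left(y_{ij}^{k} \mid X_j^{(k)} = x_j, X_i^{(k)} = x_i, \Theta_{k-1}\right) P\left(X_i^{(k)} = x_i \mid X_j^{(k)} = x_j, \Theta_{k-1}\right) \\
&= \sum_{x_i} P\left(y_{ij}^{k} \mid X_j^{(k)} = x_j, X_i^{(k)} = x_i\right) P\left(X_i^{(k)} = x_i \mid \Theta_{k-1}\right).
\end{aligned} \tag{10}$$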

In (10), we use the assumptions that $X_i^{(k)} \perp (X_j^{(k)} \mid \Theta_{k-1})$ and
$y_{ij}^{k} \perp (\Theta_{k-1} \mid X_i^{(k)}, X_j^{(k)})$. The first assumption results from the fact
that, given all the previously aggregated information in the network, the update on the location of $S_i$
at each time step is independent of that of $S_j$. The second assumption indicates that, given the most
recent updates on $S_i$ and $S_j$, the path loss between $S_i$ and $S_j$ is independent of the
previously aggregated data in the network.

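Combining (9) and (10) yields

$$P\left(Y_j^{(k)} \mid X_j^{(k)} = x_j, \Theta_{k-1}\right) = \prod_{i \in N_j} \sum_{x_i} \left[ P\left(y_{ij}^{k} \mid X_j^{(k)} = x_j, X_i^{(k)} = x_i\right) P\left(X_i^{(k)} = x_i \mid \Theta_{k-1}\right) \right]. \tag{11}$$

Finally, combining (7) and (11) completes the recursive update,

$$P\left(X_j^{(k)} = x_j \mid \Theta_k\right) \propto P\left(X_j^{(k)} = x_j \mid \Theta_{k-1}\right) \times \prod_{i \in N_j} \sum_{x_i} \left[ P\left(y_{ij}^{k} \mid X_j^{(k)} = x_j, X_i^{(k)} = x_i\right) P\left(X_i^{(k)} = x_i \mid \Theta_{k-1}\right) \right]. \tag{12}$$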

This means that in order to update the posterior of $S_j$ after observing new samples collected from $S_i$,
we need to know the priors of $S_i$ and $S_j$ in addition to the channel information
$P(y_{ij}^{k} \mid X_j^{(k)} = x_j, X_i^{(k)} = x_i)$. With respect to the total number of nodes $n$, the
algorithm has computational and communication complexity of $O(n)$ and $O(1)$ per node, respectively, which
renders the algorithm scalable. The computational complexity is the same as that of BP-based techniques,
whereas the communication overhead, which accounts for most of the power consumption
in WSNs, is significantly lower since each node only communicates with its single-hop neighbours.
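
The recursive update in (12) can be sketched with NumPy as below, treating each pmf as a vector over the M = m^2 cells and each likelihood as an M x M table. This is an illustrative sketch under stated assumptions: the function name, the array layout, the fallback to the prior when every cell is ruled out, and the toy likelihood values are all hypothetical and not taken from the paper.

```python
import numpy as np

def update_location_pmf(prior_j, neighbour_priors, likelihoods):
    """One step of the recursive update in (12) for node S_j.

    prior_j          : shape (M,), P(X_j = x_j | Theta_{k-1})
    neighbour_priors : list of shape-(M,) arrays, P(X_i = x_i | Theta_{k-1})
    likelihoods      : list of shape-(M, M) arrays; entry [x_j, x_i] is
                       P(y_ij | X_j = x_j, X_i = x_i) for the observed sample
    """
    posterior = np.array(prior_j, dtype=float)
    for prior_i, lik in zip(neighbour_priors, likelihoods):
        # Sum over x_i: location constraint imposed on S_j by neighbour S_i.
        constraint = lik @ prior_i
        posterior *= constraint
    total = posterior.sum()
    if total == 0.0:
        # All cells ruled out; keeping the prior is a design assumption.
        return np.array(prior_j, dtype=float)
    return posterior / total

# Toy example with three cells and one landmark known to be in cell 0.
M = 3
prior_j = np.full(M, 1.0 / M)
landmark = np.array([1.0, 0.0, 0.0])
lik = np.array([[0.1, 0.3, 0.6],      # hypothetical likelihood table
                [0.3, 0.1, 0.3],
                [0.6, 0.3, 0.1]])
posterior = update_location_pmf(prior_j, [landmark], [lik])
```

Because the landmark's pmf is concentrated on one cell, the constraint reduces to a single column of the likelihood table, and the uniform prior of S_j is reshaped accordingly.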

B. Localization Algorithm Compatible with Wireless Sensor Networks 

In this section, we proceed with a realization of the general algorithm, which is a more specific case
of the recursive solution proposed in (12). We assume that at the $k$-th time step only $S_i$ multicasts,
and all connected nodes update their location posterior based on the observed path loss or the
mean of the path loss samples, i.e., $Y_j^{(k)} = y_{ij}^{k}$. This means that each node receives at most
one sample in a single time step, which guarantees compatibility with real-world WSN deployments using,
e.g., TDMA or carrier sense multiple access with collision avoidance (CSMA/CA), where in each time slot a
node can listen to at most one neighbour node without interference. More specifically, AODV, the
underlying routing protocol in ZigBee, works by flooding and multicasting route request (RREQ)
packets and receiving route reply (RREP) messages; hence our proposed localization algorithm can be
integrated in a convenient and inexpensive manner.

Off-the-shelf IEEE802.15.4 compliant modules such as TelosB, MICAz and Synapse modules give
firmware engineers and designers the option to program them via Universal Serial Bus (USB), universal
asynchronous receiver/transmitter (UART) ports or over-the-air (OTA). Even though MICAz and TelosB
motes are widely used for academic and research purposes, Synapse modules, which are equipped with a
light and fast network operating system, SNAP, and a more powerful microcontroller, are more common
in outdoor and industrial applications and better suited to more complex programming (with Python)
and to mesh networking. In the next section, we use numerical examples to evaluate the performance
of our algorithm based on the radio characteristics of Synapse radio frequency (RF) modules.

September 9, 2015 DRAFT

Quantization and Compression: The underlying PHY and MAC layers impose a maximum payload size
(102 bytes). This limits the resolution of the pmf messages exchanged in the network and may prevent the
localization algorithm from achieving the desired accuracy in large orchards. There is therefore a
trade-off between localization accuracy on the one hand and the excessive power consumption and delay
caused by exchanging multiple packets between a pair of nodes to transfer the entire pmf message on the
other. Our simulations show that quantization and compression techniques can be applied so that pmf
messages with more bins fit in a single packet. The discrete cosine transform (DCT) and 6-bit quantization
help achieve a compression ratio of up to 8:1, which translates to coverage of a 100 hectare (ha) orchard
for a high node density (7 nodes/ha) pest management (mating disruption) application.
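
The general idea of fitting a pmf message into one 102-byte payload with a DCT and 6-bit quantization can be sketched as follows. This is only one plausible arrangement and an assumption throughout: the text does not describe the authors' coefficient selection, quantizer or side information, so the low-frequency block truncation, the uniform quantizer, the 16 x 16 grid, the function names and the test pmf are hypothetical, and the sketch makes no claim to reproduce the reported 8:1 ratio.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix C, so that C @ C.T is the identity."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] /= np.sqrt(2.0)
    return c

def compress_pmf(pmf, keep, bits=6):
    """Keep the keep x keep low-frequency DCT block, quantised uniformly."""
    c = dct_matrix(pmf.shape[0])
    coeffs = (c @ pmf @ c.T)[:keep, :keep]
    lo, hi = coeffs.min(), coeffs.max()
    levels = 2 ** bits - 1
    codes = np.round((coeffs - lo) / (hi - lo) * levels).astype(np.uint8)
    return codes, lo, hi

def decompress_pmf(codes, lo, hi, m, bits=6):
    """Invert compress_pmf, then clip and renormalise to a valid pmf."""
    levels = 2 ** bits - 1
    coeffs = np.zeros((m, m))
    keep = codes.shape[0]
    coeffs[:keep, :keep] = codes / levels * (hi - lo) + lo
    c = dct_matrix(m)
    pmf = np.clip(c.T @ coeffs @ c, 0.0, None)
    return pmf / pmf.sum()

# Hypothetical smooth pmf on a 16 x 16 grid.
m = 16
yy, xx = np.mgrid[0:m, 0:m]
pmf = np.exp(-((xx - 5) ** 2 + (yy - 9) ** 2) / (2 * 2.0 ** 2))
pmf /= pmf.sum()
codes, lo, hi = compress_pmf(pmf, keep=11)
recon = decompress_pmf(codes, lo, hi, m)
payload_bits = codes.size * 6
```

In this example the 121 retained coefficients occupy 726 bits, which leaves room within the 816 bits of a 102-byte payload for the two range values if they are sent as 32-bit numbers (again an assumption about the message layout).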

Path Loss Model Auto-Tuning: So far we have assumed that all sensors share global knowledge of the
path loss model; however, this is not a realistic assumption given the considerable changes
caused by seasonal environmental variations. In [20], Mao et al. proposed a path loss exponent estimation
method based on the Cayley-Menger determinant technique and pattern matching. The technique estimates the
path loss exponent with high accuracy (≈ ±0.2) for the same landmark scenario that we use
in Section IV, i.e., landmarks deployed in the corners of the field, with the estimation errors illustrated
in Figures 4a and 4b. Such location estimation error can be tolerated in pest management applications, for
which the inter-node distance is 40 m-60 m.

Precision Agriculture Accuracy Requirements: The coverage area of the sensors, the spatial correlation
of the measured features and the required distance between actuators determine the inter-node distance for
deterministic grid WSN deployments. Inter-node distance can vary from 10 m for soil moisture
[21] and electrical conductivity [22] to coarser resolutions, e.g., 60 m for pH sensing [23] or mating
disruption applications [24]. As will be seen in Section IV, our algorithm is best suited to pest
management and mating disruption applications, where there is tolerance for the error that could result
from the algorithm's simplifying assumptions or a mistuned path loss model.

IV. Performance Evaluation of The Localization Algorithm 

Fig. 4: Path loss exponent and localization estimation error reported by Mao et al. [20] for randomly
scattered nodes and landmarks in the corners.

In this section, we present the simulation results regarding the performance of our localization scheme. We
run the simulations for both random and deterministic (grid) deployments of a WSN on a square field. In
particular, we use simulations to show that the average numbers of unknown nodes and landmarks that each
node connects to affect the accuracy of the localization algorithm for a specific landmark arrangement.
Hence, we define two parameters, the average landmark degree and the average unknown node degree. Let the
landmark degree and unknown node degree of an arbitrary node $S_i$ be the number of landmarks and unknown
nodes, respectively, that $S_i$ is connected to. Note that node degree in graph theory is strongly related
to connectivity in the communications context. Further, the average unknown node degree depends on the
deployment density and transmit power level of the unknown nodes, whereas the transmit power, location and
number of the landmarks affect the average landmark degree. Different metrics have been used to evaluate
the performance of localization algorithms [25]. We use Twice the Distance Root Mean Square (2DRMS)
as the accuracy metric for our localization technique, where 2DRMS = r means there is 95% confidence
that the location estimate falls within a circle of radius r around the node's actual location.
Note that the location estimate is itself a random variable owing to the random nature of the path loss
samples and of the generating source node. The latter stems from the event-driven data delivery model
normally used in precision agriculture applications, in which a sensor transmits data only when a feature
exceeds a predetermined threshold; hence the message passing schedule differs after the landmarks advertise
themselves. The random nature of the problem makes 2DRMS a suitable accuracy metric.
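
As a small illustration of the metric, the sketch below computes 2DRMS from a set of location estimates of one node. It assumes the common definition of 2DRMS as twice the root mean square of the radial error about the true location; the function name and the sample values are hypothetical, and the text does not state the exact estimator used in the simulations.

```python
import numpy as np

def two_drms(estimates, truth):
    """2DRMS: twice the root mean square of the radial position error."""
    err = np.asarray(estimates, dtype=float) - np.asarray(truth, dtype=float)
    radial_sq = (err ** 2).sum(axis=1)
    return 2.0 * np.sqrt(radial_sq.mean())

# Hypothetical estimates (in metres) of a node whose true location is (0, 0).
estimates = [[3.0, 4.0], [0.0, 0.0], [-3.0, -4.0], [0.0, 0.0]]
value = two_drms(estimates, truth=[0.0, 0.0])
```

For these four estimates the squared radial errors are 25, 0, 25 and 0, so the metric equals 2 * sqrt(12.5), roughly 7.07 m.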

In this work, we do not concentrate on optimizing the landmark locations; however, in the next section
we explain the logic behind our adopted landmark arrangement. In the remainder of this section, we first
explain the simulation setup and assumptions. We then proceed with numerical examples to evaluate
the performance of our algorithm.

Fig. 5: A schematic view of the ZigBee route discovery phase, with the RREQ packet from the source node (orange) being flooded through the network to reach the landmarks (dotted paths) and RREP packets returning to the source node (solid arrows). Localization can be performed in conjunction with the RREQ flooding phase, the RREP return phase, or both. Each number shows how many times the RREQ packet has been multicast.


A. Methodology 

In this section, we first explain why we place landmarks in the corners or at the midpoints of the border lines, and then justify our assumptions regarding transmit power, orchard size and node density. In precision agriculture applications on farms, gateways are placed on the corners and borders of the field; in the following, we explain why this placement also improves the localization algorithm.

Landmark Arrangement: Even though placing landmarks close to each other at the centre of the field yields a higher average landmark degree, the localization accuracy drops dramatically: the path loss behaviour of such landmarks is highly correlated in any given direction, so the path loss samples collected from them at a specific point are fairly close to each other. We therefore place landmarks at the midpoints of the border lines or in the corners, since this arrangement provides more information about an unknown node's location. Figure 6 shows, for a random unknown node location, that a higher landmark degree does not necessarily result in a better location estimate. Because the distances in Figure 6a are fairly close to each other and only noisy estimates of them can be made from path loss samples, the location estimate is far less accurate than with the arrangement in Figure 6b. It can be easily shown that this holds for most points on the field. Other landmark arrangements could be studied in the same way; however, we do not elaborate on them here for reasons of space and because they do not add to the evaluation of the algorithm.
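The geometric argument can be illustrated with a small Python sketch that compares how much the node-to-landmark distances differ under the two kinds of arrangement. All coordinates are hypothetical (a 450 m by 450 m field, roughly 20 ha) and the helper name `distance_spread` is introduced only for this illustration; the sketch shows the effect described above and is not a substitute for the simulations.

```python
import math

def distance_spread(node, landmarks):
    """Largest minus smallest node-to-landmark distance, in metres."""
    dists = [math.dist(node, lm) for lm in landmarks]
    return max(dists) - min(dists)

# Hypothetical coordinates in metres on a 450 m x 450 m field (roughly 20 ha).
CENTRE_CLUSTER = [(215, 215), (235, 215), (215, 235), (235, 235)]
BORDER_MIDPOINTS = [(225, 0), (450, 225), (225, 450), (0, 225)]
NODE = (100, 300)  # an arbitrary unknown node location

if __name__ == "__main__":
    print("centre cluster spread [m]:", round(distance_spread(NODE, CENTRE_CLUSTER), 1))
    print("border midpoints spread [m]:", round(distance_spread(NODE, BORDER_MIDPOINTS), 1))
```

When the spread is small compared with the uncertainty of the RSSI-based distance estimates, the landmarks contribute nearly redundant information, which is the situation the clustered arrangement produces.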



(a) Landmarks placed in the middle, with every one of them having line of sight to the unknown node

(b) Landmarks placed on the borders, with only two of them having line of sight to the unknown node

Fig. 6: Two different landmark arrangements. The arrangement in plot 6b provides more information about the location of the unknown node despite fewer landmarks having line of sight to it.

TABLE II: Deployment Scenarios

Orchard size: 6 ha, 20 ha

[Figure 7 plots; panel title: "7 node/ha grid deployment (40 m internode distance) along with 6 landmarks"; axis label: "Distance in cross track direction [m]".]

(a) 6 landmarks (b) 8 landmarks

Fig. 7: Two different landmark arrangements. Unknown nodes and landmarks are shown as small green dots and large red diamonds, respectively. The location pmf of the designated unknown node (purple) is illustrated by a heat map.


[Figure 8 surface plot; title: "Localization error".]

Fig. 8: 2DRMS with respect to average landmark degree and average unknown node degree. Surface points are collected from all deployment scenarios.


[Figure 9 plot; title: "Average unknown degree behaviour with transmit power level".]

Fig. 9: Average unknown node degree with respect to the transmit power level of the unknown nodes. Dotted and solid graphs represent deployment scenarios for 6 ha and 20 ha orchards, respectively.

Deployment Scenarios and Assumptions: In our simulation setup, which is summarized in Table II, we adopt two orchard sizes, 6 and 20 hectares (ha), with nodes randomly scattered inside the field at two densities, 3 nodes/ha and 7 nodes/ha. As discussed earlier, these are the densities used for pest management applications, and they translate to inter-node distances of 60 m and 40 m, respectively, for grid deployment. The grid cell dimension is chosen to be 30 m so that both densities can be covered. The average size of an apple orchard varies from 1 to 20 ha across regions; according to the United States Department of Agriculture [26], the average sizes in Canada and the United States are approximately 6 ha and 20 ha, respectively. The node density and the type of deployed RF modules may vary with the precision agriculture application and the required sampling range [27]. We also adopt four landmark arrangements, with the transmit power level of the unknown nodes varying from 0 to +15 dBm and a receiver sensitivity, specified with respect to packet error rate (PER), of -103 dBm, whereas communication between landmarks and nodes occurs at the maximum transmit power (+15 dBm). The variation of landmark degree with the number of landmarks and the orchard size is also given in Table II, under the assumption that Synapse RF200 modules are used [28].
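The quoted inter-node distances can be checked with a short calculation. Assuming a square lattice in which each node occupies an equal share of one hectare (the lattice shape is an assumption made here), a density of $\rho$ nodes/ha gives a spacing $s$ of:

```latex
s = \sqrt{\frac{10^{4}\ \mathrm{m^2/ha}}{\rho}}, \qquad
\rho = 3\ \text{nodes/ha} \;\Rightarrow\; s \approx 57.7\ \mathrm{m} \approx 60\ \mathrm{m}, \qquad
\rho = 7\ \text{nodes/ha} \;\Rightarrow\; s \approx 37.8\ \mathrm{m} \approx 40\ \mathrm{m}.
```

Both spacings exceed the 30 m grid cell dimension, so neighbouring nodes in either deployment fall in different cells.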

We also assume that landmarks (gateways) and unknown nodes (sensors) are mounted above and below canopy level, respectively. We call $S_i$ and $S_j$ connected, i.e., $d_{ij} < d_{\text{connectivity}}$, if the probability of the RSSI falling below the receiver sensitivity is below 1% or, equivalently, the connectivity probability is above 99%. This maximum transmission distance is calculated from our measurement-based path loss model, summarized in Table I. Table II lists the transmission distance of the Synapse RF200 module at its maximum transmit power for which the connectivity requirement is met [28]. In the next section, we evaluate the performance of our algorithm.
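The connectivity criterion can be sketched in Python as follows. The sketch assumes a log-distance path loss model with log-normal shadowing and ignores antenna gains; the function names and all parameter values in the usage note (reference loss, path loss exponent, shadowing standard deviation) are hypothetical placeholders and are not the fitted values of Table I.

```python
import math
from statistics import NormalDist

def connectivity_distance(ptx_dbm, sens_dbm, pl0_db, n_exp, sigma_db,
                          d0_m=1.0, outage=0.01):
    """Largest distance at which P(RSSI < sensitivity) stays at or below `outage`.

    Assumed model: PL(d) = pl0_db + 10 * n_exp * log10(d / d0_m) + X,
    with X ~ N(0, sigma_db^2) in dB, and RSSI = ptx_dbm - PL(d).
    """
    z = NormalDist().inv_cdf(1.0 - outage)  # shadowing margin in standard deviations
    margin_db = ptx_dbm - sens_dbm - pl0_db - z * sigma_db
    return d0_m * 10.0 ** (margin_db / (10.0 * n_exp))

def outage_probability(d_m, ptx_dbm, sens_dbm, pl0_db, n_exp, sigma_db, d0_m=1.0):
    """Probability that the RSSI at distance d_m falls below the receiver sensitivity."""
    mean_rssi = ptx_dbm - pl0_db - 10.0 * n_exp * math.log10(d_m / d0_m)
    return NormalDist(mean_rssi, sigma_db).cdf(sens_dbm)
```

For example, `connectivity_distance(15.0, -103.0, 40.0, 3.5, 8.0)` returns the distance at which the outage probability equals 1% for these placeholder parameters; raising the transmit power or lowering the shadowing spread increases it.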

B. Results 

In this section, we study the localization error of our algorithm for different simulation scenarios. Figure 7 illustrates two landmark arrangements, with 6 and 8 landmarks, along with 150 deterministically and randomly scattered sensors operating at maximum transmit power. The location distribution of one designated node (the purple node) after the algorithm converges is also shown.

[Figure 10 plots; panel titles: "2DRMS for 3 node/ha grid deployment and different landmark arrangements", "2DRMS for 3 node/ha random deployment and different landmark arrangements", "2DRMS for 7 node/ha grid deployment and different landmark arrangements", "2DRMS for 7 node/ha random deployment and different landmark arrangements".]

(a) Low density grid deployment (b) Low density random deployment (c) High density grid deployment (d) High density random deployment

Fig. 10: 2DRMS for different node densities and landmark arrangements inside a 20 ha apple orchard. Low and high density grid deployments translate to inter-node distances of 60 m and 40 m, respectively, and are well suited to mating disruption.

In Figure 8, we illustrate the behaviour of 2DRMS with respect to the average landmark degree and the average unknown node degree. As the surface plot shows, the error drops dramatically as the average unknown node degree increases. Further, even for a low average landmark degree (≈ 1.5), an average unknown node degree of approximately 8 yields the desired 2DRMS (≈ 20 m). In Figure 9, we show how the average unknown node degree increases with the transmit power level of the unknown nodes in different simulation setups. Together, these two figures provide insight into how the algorithm behaves at different transmit power levels.

September 9, 2015 DRAFT

[Figure 11 plots; panel titles: (a) "2DRMS with respect to transmit power level of unknown nodes (3 node/ha)", (b) "2DRMS with respect to transmit power level of unknown nodes (7 node/ha)".]

Fig. 11: 2DRMS variation with transmit power level $P_{tx}$ for different scenarios, in terms of node density and number of landmarks, inside a 20 ha apple orchard. The 2-landmark scenario is excluded for clarity and lack of space, since it achieves fairly low accuracy. Increasing the node density helps achieve a low 2DRMS with fewer landmarks and lower transmit power.
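The average landmark degree and average unknown node degree used in the discussion of Figures 8 and 9 can be computed from node coordinates as in the following Python sketch. It assumes that both averages are taken over the unknown nodes and that a link exists whenever the distance is below the relevant connectivity distance; the function name and the two separate range arguments are choices made for this illustration.

```python
import math

def average_degrees(unknowns, landmarks, d_unknown, d_landmark):
    """Return (average landmark degree, average unknown node degree).

    unknowns, landmarks: lists of (x, y) coordinates in metres.
    d_unknown: connectivity distance between two unknown nodes.
    d_landmark: connectivity distance between a landmark and an unknown node
    (landmarks transmit at maximum power, so the two ranges may differ).
    """
    n = len(unknowns)
    landmark_links = sum(
        1 for u in unknowns for lm in landmarks if math.dist(u, lm) < d_landmark
    )
    unknown_links = sum(
        1
        for a in range(n)
        for b in range(n)
        if a != b and math.dist(unknowns[a], unknowns[b]) < d_unknown
    )
    return landmark_links / n, unknown_links / n
```

Sweeping `d_unknown` over the ranges that correspond to different transmit power levels reproduces the kind of relationship plotted in Figure 9.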

Figure 10 shows the behaviour of 2DRMS over the course of the algorithm for different simulation setups; the algorithm converges after a few messages have been multicast in the network. As explained in Algorithm 1, the procedure starts with the landmarks advertising themselves to the entire network. This significantly speeds up convergence, since the one-hop neighbours of the landmarks obtain a narrower pmf estimate in the first round. As can be seen in the figure, the 6 and 8 landmark/gateway scenarios generally meet the accuracy requirement for pest management; however, for the algorithm to work for soil moisture sensing, either the number of landmarks or their maximum transmit power needs to increase. Our simulations also showed that a finer pmf resolution does not affect the accuracy when the cell dimension already supports the application in terms of inter-node distance. We also observed that the total number of messages needed for the algorithm to converge grows more slowly than O(n), which is promising from a scalability standpoint. By comparison, spanning tree variants of BP-based techniques require at least O(n) messages to build the spanning tree, after which every sensor must multicast at each iteration, with the algorithm taking anywhere between 1 and 3 iterations to converge. This means our algorithm converges faster and consumes less communication energy, at the expense of accuracy.

In Figure 11, the localization error for a 20 ha orchard with inter-node distances of 40 m and 60 m is depicted with respect to the transmit power level. Node density has a greater influence at low transmit power levels, which is consistent with our observations from Figure 8. Once the transmit power increases, at a fixed landmark degree, the average unknown node degree exceeds the required threshold and the error drops to its minimum. Based on the work in [20] and our simulations, the algorithm meets pest management (mating disruption) requirements with acceptable probability (above 90%) inside a 20 ha orchard with 8 landmarks and all unknown nodes running on Synapse RF200 modules; however, a different transceiver module may demand a different landmark setup, since its maximum transmit power level would be different. More landmarks are needed in larger orchards in order to maintain the required average landmark degree.

V. Conclusion 

Connectivity to landmarks in static WSNs deployed in large agricultural environments such as farms and orchards is limited by excessive path loss and the large size of the field. In addition, the large number of nodes in the field and the nature of the higher layer communication algorithms, in terms of transmit power and multicasting, make the connectivity graph of these WSNs very loopy. Most existing localization algorithms are ill-suited for use in such environments because they are overly complex, susceptible to loopy connectivity graphs, and incapable of real time updates, i.e., all the inter-node distance estimations must be completed before the algorithm runs.

Our scalable RSSI-based localization algorithm overcomes these limitations by:

1) using only local distance estimates with respect to neighbouring nodes,

2) requiring only a small number of landmarks compared to the total number of nodes,

3) adopting a coarser or finer grid of the field depending on the precision agriculture application, its desired localization accuracy, and the processing power available at the microcontroller of the transceiver modules.

The algorithm uses a Bayesian model for information aggregation to achieve communication and computational complexity that scale well with the number of nodes. The computational burden of the algorithm is divided among nodes and time steps. Moreover, the algorithm can be stopped at any time step to make a decision on the location of the nodes.

The main strength of our localization algorithm is its compatibility with realistic WSN deployment scenarios and the low communication overhead it adds to already deployed routing protocols. Further, the route discovery phase of ad hoc on-demand distance vector (AODV) routing protocols, e.g., ZigBee and similar schemes, works by flooding and multicasting route request (RREQ) packets; hence, our proposed localization algorithm can be integrated in a convenient and inexpensive manner.
Algorithm 1 Localization Algorithm For Agricultural Environments

Step 1: Initialization (path loss model auto-tuning if required)
  For $i = 1, \ldots, n_a$ (initializing landmark locations)
    • $P(X_i^{(0)}) = \delta_{x_i}(x)$
  For $i = n_a + 1, \ldots, N$ (initializing unknown node locations)
    • $P(X_i^{(0)}) = 1/m^2$

Step 2: Landmarks advertise themselves to unknown nodes
  For $i = 1, \ldots, n_a$
    For $\forall j \in \mathcal{N}_i$
      • $P(X_j^{(i)} \mid \Theta_i) = P(X_j^{(i-1)} \mid \Theta_{i-1}) \, P(y_{ij}^{(i)} \mid X_j^{(i-1)} = x_j, X_i^{(i-1)} = x_i)$
      • Normalize $P(X_j^{(i)} \mid \Theta_i)$
  Multicasting and updating with (12) continue until all unknown nodes are covered for each landmark advertisement.

Step 3: A random node $S_i$ becomes the source and multicasts the RREQ packet

Step 4:
  For $j = n_a + 1, \ldots, N$
    • If $d_{ij} < d_{\text{connectivity}}$, $\forall j \neq i$
      - Updating rule (12)
      - Normalization
      - $S_j$ forwards and multicasts the RREQ packet if hop count allows (AODV)
    • else
      - $P(X_j^{(i)} = x_j \mid \Theta_i) = P(X_j^{(i-1)} = x_j \mid \Theta_{i-1})$ (no change in location estimation)
  While the RREQ packet has not reached the landmark
    $i \leftarrow \forall j \in \mathcal{N}_i$
    Redo Step 4

Step 5: Landmarks return the RREP packet over the minimum hop route towards the source
  For $\forall$ consecutive pairs $(i, j)$ on the landmark-source route
    • $P(X_j^{(i)} \mid \Theta_i) = P(X_j^{(i-1)} \mid \Theta_{i-1}) \, P(y_{ij}^{(i)} \mid X_j^{(i-1)} = x_j, X_i^{(i-1)} = x_i)$
    • Normalize $P(X_j^{(i)} \mid \Theta_i)$
    • else
      - $P(X_j^{(i)} \mid \Theta_i) = P(X_j^{(i-1)} \mid \Theta_{i-1})$
  Go back to Step 3

Step 6: Decision making after $M$ time steps
  For $j = n_a + 1, \ldots, N$
    • $\hat{x}_j = \arg\max_{x_j} P(X_j^{(M)} = x_j \mid \Theta_M)$
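As a rough illustration of the recursive pmf update that underlies Algorithm 1, the Python sketch below performs one update of an unknown node's location pmf on a grid, followed by the maximum a posteriori decision. It is a sketch under stated assumptions rather than the authors' implementation: the log-distance path loss parameters are hypothetical, the likelihood is taken to be Gaussian in dB, and the neighbour's location is marginalized over its own pmf, which is one plausible reading of how the communicated prior is used. All function names are introduced for this example.

```python
import math

# Hypothetical log-distance path loss parameters (not the values of Table I).
PL0_DB, PL_EXP, SIGMA_DB, D0_M = 40.0, 3.0, 6.0, 1.0

def mean_path_loss(dist_m):
    """Mean path loss in dB at dist_m under the assumed log-distance model."""
    d = max(dist_m, D0_M)  # clip so the logarithm is defined when two cells coincide
    return PL0_DB + 10.0 * PL_EXP * math.log10(d / D0_M)

def likelihood(sample_db, dist_m):
    """Gaussian (in dB) likelihood of a path loss sample, up to a constant factor."""
    z = (sample_db - mean_path_loss(dist_m)) / SIGMA_DB
    return math.exp(-0.5 * z * z)

def grid_cells(m, cell_m):
    """Centres of an m x m grid of square cells with side cell_m, in row-major order."""
    return [((a + 0.5) * cell_m, (b + 0.5) * cell_m) for a in range(m) for b in range(m)]

def update_pmf(prior_j, pmf_i, sample_db, cells):
    """One recursive update of node j's pmf from neighbour i's pmf and one sample."""
    posterior = []
    for cell_j, p_j in zip(cells, prior_j):
        # Marginalize over the neighbour's possible cells (zero-probability cells skipped).
        evidence = sum(
            p_i * likelihood(sample_db, math.dist(cell_j, cell_i))
            for cell_i, p_i in zip(cells, pmf_i)
            if p_i > 0.0
        )
        posterior.append(p_j * evidence)
    total = sum(posterior)
    return [p / total for p in posterior]  # normalization step

def map_estimate(pmf, cells):
    """Cell centre with the largest probability (the decision step)."""
    return cells[max(range(len(pmf)), key=pmf.__getitem__)]
```

A landmark is represented by a pmf that is 1 at its own cell and 0 elsewhere, and an unknown node starts from the uniform pmf `[1 / m**2] * m**2`. Calling `update_pmf` once per received sample, always passing the previous output as the new prior, gives the recursion; only the latest sample and the neighbour's current pmf are needed at each step.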